DCU at the NTCIR-11 SpokenQuery&Doc Task
Abstract
We describe DCU’s participation in the NTCIR-11 SpokenQuery&Doc task. We participated in the spoken query spoken content retrieval (SQ-SCR) subtask, using slide group segments as the basic indexing and retrieval units. Our approach integrates normalised prosodic features into a standard BM25 weighting function to increase the weights of terms that are prominent in speech. Text queries and relevance assessment data from the NTCIR-10 SpokenDoc-2 passage retrieval task were used to train the prosodic-based models. Evaluation results indicate that our prosodic-based retrieval models do not provide significant improvements over a text-based BM25 model, but suggest that they can be useful for certain queries.
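As a rough illustration of the idea in the abstract, the sketch below scores one segment with a standard BM25 weight and multiplies each matching term's weight by a boost derived from a normalised prominence value. The abstract does not give the actual formula, so the multiplicative boost, the `alpha` parameter, the `prominence` dictionary and the function name are all assumptions made for illustration, not the authors' model.

```python
import math
from collections import Counter


def bm25_prosodic_score(query_terms, segment, segments, prominence=None,
                        k1=1.2, b=0.75, alpha=0.5):
    """Score one segment against a query with BM25, optionally boosted by prosody.

    segment    -- list of tokens for the segment being scored
    segments   -- list of token lists for the whole collection
    prominence -- hypothetical dict: term -> normalised prominence in [0, 1]
                  for this segment (assumption, not from the paper)
    alpha      -- hypothetical strength of the prosodic boost (assumption)
    """
    prominence = prominence or {}
    n_segments = len(segments)
    avg_len = sum(len(s) for s in segments) / n_segments
    tf = Counter(segment)

    score = 0.0
    for term in set(query_terms):
        if tf[term] == 0:
            continue  # term absent from this segment contributes nothing
        # Number of segments containing the term.
        df = sum(1 for s in segments if term in s)
        # Non-negative variant of the BM25 inverse document frequency.
        idf = math.log(1 + (n_segments - df + 0.5) / (df + 0.5))
        # Length-normalised, saturating term frequency component.
        length_norm = 1 - b + b * len(segment) / avg_len
        norm_tf = tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        # Assumed prosodic boost: 1.0 when no prominence is known.
        boost = 1.0 + alpha * prominence.get(term, 0.0)
        score += idf * norm_tf * boost
    return score
```

With `prominence` left empty the function reduces to plain text-based BM25, which mirrors the baseline comparison mentioned in the abstract.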
Similar resources
Spoken Document Retrieval Experiments for SpokenQuery&Doc at Ryukoku University (RYSDT)
In this paper, we describe the spoken document retrieval (SDR) systems of Ryukoku University, which participated in the NTCIR-11 “SpokenQuery&Doc” task. The NTCIR-11 SpokenQuery&Doc task has two subtasks: the “spoken content retrieval (SCR) subtask” and the “spoken term detection (STD) subtask”. We participated in both the SCR and STD subtasks as team RYSDT, and describe our SDR and STD systems here.
DCU at the NTCIR-12 SpokenQuery&Doc-2 Task
We describe DCU’s participation in the NTCIR-12 SpokenQuery&Doc-2 (SQD-2) task. In the context of the slide-group retrieval sub-task, we experiment with a passage retrieval method that re-scores each passage according to the relevance score of the document from which the passage is taken. This is performed by linearly interpolating their relevance scores, which are calculated using the Okapi BM25 ...
Overview of the NTCIR-11 SpokenQuery&Doc Task
This paper presents an overview of the Spoken Query and Spoken Document retrieval (SpokenQuery&Doc) task at the NTCIR-11 Workshop. This task included spoken query driven spoken content retrieval (SQ-SCR) as the main sub-task, with a spoken query driven spoken term detection task (SQ-STD) as an additional sub-task. The paper describes details of each sub-task, the data used, the creation of the sp...
Spoken Term Detection Based on a Syllable N-gram Index at the NTCIR-11 SpokenQuery&Doc Task
For spoken term detection, it is crucial to consider out-of-vocabulary (OOV) words and the misrecognition of spoken words. Therefore, various sub-word unit based recognition and retrieval methods have been proposed. We also proposed a distant n-gram indexing/retrieval method for spoken queries, which is based on a syllable n-gram and incorporates a distance metric in a syllable lattice. The distance ...
Overview of the NTCIR-12 SpokenQuery&Doc-2 Task
This paper presents an overview of the Spoken Query and Spoken Document retrieval (SpokenQuery&Doc-2) task at the NTCIR-12 Workshop. This task included spoken query driven spoken content retrieval (SQ-SCR) and spoken query driven spoken term detection (SQ-STD) as the two subtasks. The paper describes details of each sub-task, the data used, the creation of the speech recognition systems used ...
Publication date: 2014